Dataset statistics
| Number of variables | 10 |
|---|---|
| Number of observations | 768 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 60.1 KiB |
| Average record size in memory | 80.2 B |
Variable types
| Numeric | 6 |
|---|---|
| Categorical | 4 |
relative compactness is highly correlated with surface area and 4 other fields | High correlation |
surface area is highly correlated with relative compactness and 4 other fields | High correlation |
roof area is highly correlated with relative compactness and 4 other fields | High correlation |
overall height is highly correlated with relative compactness and 4 other fields | High correlation |
heating load is highly correlated with relative compactness and 4 other fields | High correlation |
cooling load is highly correlated with relative compactness and 4 other fields | High correlation |
relative compactness is highly correlated with surface area and 2 other fields | High correlation |
surface area is highly correlated with relative compactness and 2 other fields | High correlation |
roof area is highly correlated with relative compactness and 4 other fields | High correlation |
overall height is highly correlated with relative compactness and 4 other fields | High correlation |
heating load is highly correlated with roof area and 2 other fields | High correlation |
cooling load is highly correlated with roof area and 2 other fields | High correlation |
glazing area distribution is highly correlated with glazing area | High correlation |
overall height is highly correlated with surface area and 5 other fields | High correlation |
surface area is highly correlated with overall height and 5 other fields | High correlation |
roof area is highly correlated with overall height and 5 other fields | High correlation |
heating load is highly correlated with overall height and 6 other fields | High correlation |
wall area is highly correlated with overall height and 5 other fields | High correlation |
glazing area is highly correlated with glazing area distribution and 2 other fields | High correlation |
cooling load is highly correlated with overall height and 6 other fields | High correlation |
relative compactness is highly correlated with overall height and 5 other fields | High correlation |
roof area is highly correlated with overall height | High correlation |
overall height is highly correlated with roof area | High correlation |
overall height is uniformly distributed | Uniform |
orientation is uniformly distributed | Uniform |
glazing area distribution has 48 (6.2%) zeros | Zeros |
Reproduction
| Analysis started | 2021-09-07 07:21:09.548927 |
|---|---|
| Analysis finished | 2021-09-07 07:21:26.089497 |
| Duration | 16.54 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
relative compactness
Real number (ℝ≥0)
High correlation
| Distinct | 12 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.7641666681 |
| Minimum | 0.6200000048 |
|---|---|
| Maximum | 0.9800000191 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 0.6200000048 |
|---|---|
| 5-th percentile | 0.6200000048 |
| Q1 | 0.6825000048 |
| median | 0.75 |
| Q3 | 0.8299999982 |
| 95-th percentile | 0.9800000191 |
| Maximum | 0.9800000191 |
| Range | 0.3600000143 |
| Interquartile range (IQR) | 0.1474999934 |
Descriptive statistics
| Standard deviation | 0.1057774774 |
|---|---|
| Coefficient of variation (CV) | 0.1384219985 |
| Kurtosis | -0.7065673466 |
| Mean | 0.7641666681 |
| Median Absolute Deviation (MAD) | 0.07999998331 |
| Skewness | 0.495512586 |
| Sum | 586.8800011 |
| Variance | 0.01118887472 |
| Monotonicity | Not monotonic |
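Several of the derived statistics in these tables follow directly from the others. As a minimal sketch (variable names are illustrative and not part of the report), the range, interquartile range, coefficient of variation and variance for relative compactness can be recomputed from the reported minimum, maximum, quartiles, mean and standard deviation:

```python
# Values copied from the relative compactness tables in this report.
minimum, maximum = 0.6200000048, 0.9800000191
q1, q3 = 0.6825000048, 0.8299999982
mean, std = 0.7641666681, 0.1057774774

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = standard deviation squared

print(f"range={value_range:.4f} iqr={iqr:.4f} cv={cv:.4f} variance={variance:.5f}")
```

The recomputed values agree with the Range, IQR, CV and Variance rows up to rounding of the printed inputs.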
Histogram with fixed size bins (bins=12)
| Value | Count | Frequency (%) |
| 0.7900000215 | 64 | |
| 0.6200000048 | 64 | |
| 0.6600000262 | 64 | |
| 0.7400000095 | 64 | |
| 0.8199999928 | 64 | |
| 0.8600000143 | 64 | |
| 0.8999999762 | 64 | |
| 0.6899999976 | 64 | |
| 0.9800000191 | 64 | |
| 0.6399999857 | 64 | |
| Other values (2) | 128 |
| Value | Count | Frequency (%) |
| 0.6200000048 | 64 | |
| 0.6399999857 | 64 | |
| 0.6600000262 | 64 | |
| 0.6899999976 | 64 | |
| 0.7099999785 | 64 | |
| 0.7400000095 | 64 | |
| 0.7599999905 | 64 | |
| 0.7900000215 | 64 | |
| 0.8199999928 | 64 | |
| 0.8600000143 | 64 |
| Value | Count | Frequency (%) |
| 0.9800000191 | 64 | |
| 0.8999999762 | 64 | |
| 0.8600000143 | 64 | |
| 0.8199999928 | 64 | |
| 0.7900000215 | 64 | |
| 0.7599999905 | 64 | |
| 0.7400000095 | 64 | |
| 0.7099999785 | 64 | |
| 0.6899999976 | 64 | |
| 0.6600000262 | 64 |
surface area
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 671.7083333 |
| Minimum | 514.5 |
|---|---|
| Maximum | 808.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 514.5 |
|---|---|
| 5-th percentile | 514.5 |
| Q1 | 606.375 |
| median | 673.75 |
| Q3 | 741.125 |
| 95-th percentile | 808.5 |
| Maximum | 808.5 |
| Range | 294 |
| Interquartile range (IQR) | 134.75 |
Descriptive statistics
| Standard deviation | 88.08611606 |
|---|---|
| Coefficient of variation (CV) | 0.1311374471 |
| Kurtosis | -1.059454167 |
| Mean | 671.7083333 |
| Median Absolute Deviation (MAD) | 73.5 |
| Skewness | -0.1251308847 |
| Sum | 515872 |
| Variance | 7759.163842 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=12)
| Value | Count | Frequency (%) |
| 563.5 | 64 | |
| 735 | 64 | |
| 686 | 64 | |
| 637 | 64 | |
| 808.5 | 64 | |
| 514.5 | 64 | |
| 759.5 | 64 | |
| 710.5 | 64 | |
| 661.5 | 64 | |
| 612.5 | 64 | |
| Other values (2) | 128 |
| Value | Count | Frequency (%) |
| 514.5 | 64 | |
| 563.5 | 64 | |
| 588 | 64 | |
| 612.5 | 64 | |
| 637 | 64 | |
| 661.5 | 64 | |
| 686 | 64 | |
| 710.5 | 64 | |
| 735 | 64 | |
| 759.5 | 64 |
| Value | Count | Frequency (%) |
| 808.5 | 64 | |
| 784 | 64 | |
| 759.5 | 64 | |
| 735 | 64 | |
| 710.5 | 64 | |
| 686 | 64 | |
| 661.5 | 64 | |
| 637 | 64 | |
| 612.5 | 64 | |
| 588 | 64 |
wall area
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 318.5 |
| Minimum | 245 |
|---|---|
| Maximum | 416.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 245 |
|---|---|
| 5-th percentile | 245 |
| Q1 | 294 |
| median | 318.5 |
| Q3 | 343 |
| 95-th percentile | 416.5 |
| Maximum | 416.5 |
| Range | 171.5 |
| Interquartile range (IQR) | 49 |
Descriptive statistics
| Standard deviation | 43.62648144 |
|---|---|
| Coefficient of variation (CV) | 0.136974824 |
| Kurtosis | 0.11659327 |
| Mean | 318.5 |
| Median Absolute Deviation (MAD) | 24.5 |
| Skewness | 0.5334174897 |
| Sum | 244608 |
| Variance | 1903.269883 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=7)
| Value | Count | Frequency (%) |
| 318.5 | 192 | |
| 294 | 192 | |
| 343 | 128 | |
| 367.5 | 64 | 8.3% |
| 245 | 64 | 8.3% |
| 269.5 | 64 | 8.3% |
| 416.5 | 64 | 8.3% |
| Value | Count | Frequency (%) |
| 245 | 64 | 8.3% |
| 269.5 | 64 | 8.3% |
| 294 | 192 | |
| 318.5 | 192 | |
| 343 | 128 | |
| 367.5 | 64 | 8.3% |
| 416.5 | 64 | 8.3% |
| Value | Count | Frequency (%) |
| 416.5 | 64 | 8.3% |
| 367.5 | 64 | 8.3% |
| 343 | 128 | |
| 318.5 | 192 | |
| 294 | 192 | |
| 269.5 | 64 | 8.3% |
| 245 | 64 | 8.3% |
roof area
Categorical
High correlation
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.1 KiB |
| 220.5 | |
|---|---|
| 147.0 | |
| 122.5 | |
| 110.25 |
Length
| Max length | 6 |
|---|---|
| Median length | 5 |
| Mean length | 5.083333333 |
| Min length | 5 |
Characters and Unicode
| Total characters | 3904 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 110.25 |
|---|---|
| 2nd row | 110.25 |
| 3rd row | 110.25 |
| 4th row | 110.25 |
| 5th row | 122.5 |
Common Values
| Value | Count | Frequency (%) |
| 220.5 | 384 | |
| 147.0 | 192 | |
| 122.5 | 128 | 16.7% |
| 110.25 | 64 | 8.3% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 220.5 | 384 | |
| 147.0 | 192 | |
| 122.5 | 128 | 16.7% |
| 110.25 | 64 | 8.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1088 | |
| . | 768 | |
| 0 | 640 | |
| 5 | 576 | |
| 1 | 448 | |
| 4 | 192 | 4.9% |
| 7 | 192 | 4.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3136 | |
| Other Punctuation | 768 | 19.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1088 | |
| 0 | 640 | |
| 5 | 576 | |
| 1 | 448 | |
| 4 | 192 | 6.1% |
| 7 | 192 | 6.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 768 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3904 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1088 | |
| . | 768 | |
| 0 | 640 | |
| 5 | 576 | |
| 1 | 448 | |
| 4 | 192 | 4.9% |
| 7 | 192 | 4.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3904 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1088 | |
| . | 768 | |
| 0 | 640 | |
| 5 | 576 | |
| 1 | 448 | |
| 4 | 192 | 4.9% |
| 7 | 192 | 4.9% |
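Because roof area is treated as categorical, the report analyses its labels as text. As a rough sketch (not the profiler's own implementation), the character counts, total characters and mean length shown for roof area can be reproduced from the four labels and their counts in the Common Values table:

```python
from collections import Counter

# Labels and row counts from the roof area "Common Values" table.
value_counts = {"220.5": 384, "147.0": 192, "122.5": 128, "110.25": 64}

# Each character of a label occurs once per row that carries that label.
char_counts = Counter()
for label, count in value_counts.items():
    for ch in label:
        char_counts[ch] += count

n_rows = sum(value_counts.values())
total_chars = sum(char_counts.values())
mean_length = total_chars / n_rows

print(char_counts.most_common())
print(total_chars, len(char_counts), round(mean_length, 6))
```

The totals match the "Total characters", "Distinct characters" and "Mean length" rows reported for this variable.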
overall height
Categorical
High correlation, Uniform
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.1 KiB |
| 3.5 | |
|---|---|
| 7.0 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2304 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 7.0 |
|---|---|
| 2nd row | 7.0 |
| 3rd row | 7.0 |
| 4th row | 7.0 |
| 5th row | 7.0 |
Common Values
| Value | Count | Frequency (%) |
| 3.5 | 384 | |
| 7.0 | 384 |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 7.0 | 384 | |
| 3.5 | 384 |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 768 | |
| 7 | 384 | |
| 0 | 384 | |
| 3 | 384 | |
| 5 | 384 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1536 | |
| Other Punctuation | 768 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 7 | 384 | |
| 0 | 384 | |
| 3 | 384 | |
| 5 | 384 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 768 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2304 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 768 | |
| 7 | 384 | |
| 0 | 384 | |
| 3 | 384 | |
| 5 | 384 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2304 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 768 | |
| 7 | 384 | |
| 0 | 384 | |
| 3 | 384 | |
| 5 | 384 |
orientation
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.1 KiB |
| 2 | |
|---|---|
| 5 | |
| 3 | |
| 4 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 768 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 3 |
| 3rd row | 4 |
| 4th row | 5 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
| 2 | 192 | |
| 5 | 192 | |
| 3 | 192 | |
| 4 | 192 |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 4 | 192 | |
| 3 | 192 | |
| 5 | 192 | |
| 2 | 192 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 192 | |
| 3 | 192 | |
| 4 | 192 | |
| 5 | 192 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 768 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 192 | |
| 3 | 192 | |
| 4 | 192 | |
| 5 | 192 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 768 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 192 | |
| 3 | 192 | |
| 4 | 192 | |
| 5 | 192 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 768 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 192 | |
| 3 | 192 | |
| 4 | 192 | |
| 5 | 192 |
glazing area
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.1 KiB |
| 0.10000000149011612 | |
|---|---|
| 0.25 | |
| 0.4000000059604645 | |
| 0.0 |
Length
| Max length | 19 |
|---|---|
| Median length | 18 |
| Mean length | 13 |
| Min length | 3 |
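Labels such as 0.10000000149011612 are what one would expect if the column were stored as 32-bit floats and then rendered at double precision. The report does not state the column's dtype, so this is an assumption. Under that assumption, a small standard-library sketch reproduces the labels and the length statistics above (the helper `as_float32` is hypothetical):

```python
import struct

def as_float32(x: float) -> float:
    """Round x to the nearest 32-bit float and return it as a Python float."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Nominal glazing area values and the row counts reported for them.
nominal = [0.0, 0.1, 0.25, 0.4]
counts = [48, 240, 240, 240]

labels = [str(as_float32(v)) for v in nominal]
lengths = [len(s) for s in labels]
mean_length = sum(n * c for n, c in zip(lengths, counts)) / sum(counts)

print(labels)
print(min(lengths), max(lengths), mean_length)
```

Values such as 0.0 and 0.25 are exactly representable in binary, which is why their labels stay short while 0.1 and 0.4 pick up extra digits.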
Characters and Unicode
| Total characters | 9984 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.10000000149011612 | 240 | |
| 0.25 | 240 | |
| 0.4000000059604645 | 240 | |
| 0.0 | 48 | 6.2% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 0.4000000059604645 | 240 | |
| 0.25 | 240 | |
| 0.10000000149011612 | 240 | |
| 0.0 | 48 | 6.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4656 | |
| 1 | 1200 | 12.0% |
| 4 | 960 | 9.6% |
| . | 768 | 7.7% |
| 6 | 720 | 7.2% |
| 5 | 720 | 7.2% |
| 9 | 480 | 4.8% |
| 2 | 480 | 4.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 9216 | |
| Other Punctuation | 768 | 7.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4656 | |
| 1 | 1200 | 13.0% |
| 4 | 960 | 10.4% |
| 6 | 720 | 7.8% |
| 5 | 720 | 7.8% |
| 9 | 480 | 5.2% |
| 2 | 480 | 5.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 768 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 9984 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 4656 | |
| 1 | 1200 | 12.0% |
| 4 | 960 | 9.6% |
| . | 768 | 7.7% |
| 6 | 720 | 7.2% |
| 5 | 720 | 7.2% |
| 9 | 480 | 4.8% |
| 2 | 480 | 4.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9984 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 4656 | |
| 1 | 1200 | 12.0% |
| 4 | 960 | 9.6% |
| . | 768 | 7.7% |
| 6 | 720 | 7.2% |
| 5 | 720 | 7.2% |
| 9 | 480 | 4.8% |
| 2 | 480 | 4.8% |
glazing area distribution
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.8125 |
| Minimum | 0 |
|---|---|
| Maximum | 5 |
| Zeros | 48 |
| Zeros (%) | 6.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1.75 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 2.25 |
Descriptive statistics
| Standard deviation | 1.550959664 |
|---|---|
| Coefficient of variation (CV) | 0.5514523251 |
| Kurtosis | -1.148708815 |
| Mean | 2.8125 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.08868917544 |
| Sum | 2160 |
| Variance | 2.40547588 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=6)
| Value | Count | Frequency (%) |
| 5 | 144 | |
| 4 | 144 | |
| 3 | 144 | |
| 2 | 144 | |
| 1 | 144 | |
| 0 | 48 | 6.2% |
| Value | Count | Frequency (%) |
| 0 | 48 | 6.2% |
| 1 | 144 | |
| 2 | 144 | |
| 3 | 144 | |
| 4 | 144 | |
| 5 | 144 |
| Value | Count | Frequency (%) |
| 5 | 144 | |
| 4 | 144 | |
| 3 | 144 | |
| 2 | 144 | |
| 1 | 144 | |
| 0 | 48 | 6.2% |
heating load
Real number (ℝ≥0)
| Distinct | 586 |
|---|---|
| Distinct (%) | 76.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22.30720054 |
| Minimum | 6.010000229 |
|---|---|
| Maximum | 43.09999847 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 6.010000229 |
|---|---|
| 5-th percentile | 10.46350012 |
| Q1 | 12.99250007 |
| median | 18.94999981 |
| Q3 | 31.66750002 |
| 95-th percentile | 39.86000061 |
| Maximum | 43.09999847 |
| Range | 37.08999825 |
| Interquartile range (IQR) | 18.67499995 |
Descriptive statistics
| Standard deviation | 10.09019575 |
|---|---|
| Coefficient of variation (CV) | 0.4523290913 |
| Kurtosis | -1.245571862 |
| Mean | 22.30720054 |
| Median Absolute Deviation (MAD) | 7.514999866 |
| Skewness | 0.3604488848 |
| Sum | 17131.93002 |
| Variance | 101.8120503 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 15.15999985 | 6 | 0.8% |
| 13 | 5 | 0.7% |
| 15.22999954 | 4 | 0.5% |
| 14.60000038 | 4 | 0.5% |
| 32.31000137 | 4 | 0.5% |
| 28.14999962 | 4 | 0.5% |
| 12.93000031 | 4 | 0.5% |
| 15.09000015 | 4 | 0.5% |
| 15.55000019 | 4 | 0.5% |
| 10.68000031 | 4 | 0.5% |
| Other values (576) | 725 |
| Value | Count | Frequency (%) |
| 6.010000229 | 1 | |
| 6.039999962 | 1 | |
| 6.050000191 | 1 | |
| 6.070000172 | 1 | |
| 6.369999886 | 2 | |
| 6.400000095 | 2 | |
| 6.769999981 | 1 | |
| 6.789999962 | 1 | |
| 6.809999943 | 1 | |
| 6.849999905 | 1 |
| Value | Count | Frequency (%) |
| 43.09999847 | 1 | |
| 42.95999908 | 1 | |
| 42.77000046 | 1 | |
| 42.74000168 | 1 | |
| 42.61999893 | 1 | |
| 42.5 | 1 | |
| 42.49000168 | 1 | |
| 42.11000061 | 1 | |
| 42.08000183 | 1 | |
| 41.95999908 | 1 |
cooling load
Real number (ℝ≥0)
| Distinct | 636 |
|---|---|
| Distinct (%) | 82.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24.58776039 |
| Minimum | 10.89999962 |
|---|---|
| Maximum | 48.02999878 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 10.89999962 |
|---|---|
| 5-th percentile | 13.61750011 |
| Q1 | 15.62000012 |
| median | 22.07999992 |
| Q3 | 33.13250065 |
| 95-th percentile | 40.03699837 |
| Maximum | 48.02999878 |
| Range | 37.12999916 |
| Interquartile range (IQR) | 17.51250052 |
Descriptive statistics
| Standard deviation | 9.513305495 |
|---|---|
| Coefficient of variation (CV) | 0.3869122419 |
| Kurtosis | -1.147190359 |
| Mean | 24.58776039 |
| Median Absolute Deviation (MAD) | 7.540000439 |
| Skewness | 0.3959924526 |
| Sum | 18883.39998 |
| Variance | 90.50298144 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 14.27999973 | 4 | 0.5% |
| 14.27000046 | 4 | 0.5% |
| 29.79000092 | 4 | 0.5% |
| 17.20000076 | 4 | 0.5% |
| 21.32999992 | 4 | 0.5% |
| 14.67000008 | 3 | 0.4% |
| 15.85000038 | 3 | 0.4% |
| 32.83000183 | 3 | 0.4% |
| 15.43999958 | 3 | 0.4% |
| 13.72000027 | 3 | 0.4% |
| Other values (626) | 733 |
| Value | Count | Frequency (%) |
| 10.89999962 | 1 | |
| 10.93999958 | 1 | |
| 11.17000008 | 1 | |
| 11.18999958 | 1 | |
| 11.27000046 | 1 | |
| 11.28999996 | 1 | |
| 11.67000008 | 1 | |
| 11.72000027 | 1 | |
| 11.72999954 | 1 | |
| 11.73999977 | 1 |
| Value | Count | Frequency (%) |
| 48.02999878 | 1 | |
| 47.59000015 | 1 | |
| 47.00999832 | 1 | |
| 46.93999863 | 1 | |
| 46.43999863 | 1 | |
| 46.22999954 | 1 | |
| 45.97000122 | 1 | |
| 45.59000015 | 1 | |
| 45.52000046 | 1 | |
| 45.47999954 | 1 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
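The calculation described here can be sketched in a few lines of plain Python. This is an illustrative version rather than the profiler's implementation; the function name `pearson_r` is hypothetical, and the sample values are taken from a handful of rows shown in this report:

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors of the covariance and the standard deviations cancel,
    so plain sums of deviations are used.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Relative compactness and surface area from six sample rows of the report.
compactness = [0.98, 0.90, 0.86, 0.66, 0.64, 0.62]
surface = [514.5, 563.5, 588.0, 759.5, 784.0, 808.5]

r = pearson_r(compactness, surface)

# Shifting and rescaling one variable by a positive factor leaves r unchanged.
r_rescaled = pearson_r([2 * c + 1 for c in compactness], surface)
print(r, r_rescaled)
```

On these rows r is strongly negative, in line with the high-correlation warning reported for this pair of variables.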
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
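As a sketch of that definition, ρ can be computed by ranking both variables and applying Pearson's formula to the ranks. The helper names below are hypothetical, and ties receive the average of their rank positions, which is a common convention assumed here:

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this monotonic, nonlinear example ρ reaches its maximum while r stays below it, which illustrates the difference between the two coefficients.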
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
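The pair-counting definition translates directly into code. The sketch below implements exactly the formula as stated (often called τ-a), with tied pairs counted as neither concordant nor discordant. The function name is hypothetical, and library implementations may use a tie-adjusted variant, so their results can differ when the data contain ties:

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

tau_same = kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4])
tau_reversed = kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1])
tau_mixed = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau_same, tau_reversed, tau_mixed)
```

This version compares every pair and therefore runs in quadratic time, which is fine for a dataset of this size.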
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| | relative compactness | surface area | wall area | roof area | overall height | orientation | glazing area | glazing area distribution | heating load | cooling load |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.98 | 514.5 | 294.0 | 110.25 | 7.0 | 2 | 0.0 | 0 | 15.550000 | 21.330000 |
| 1 | 0.98 | 514.5 | 294.0 | 110.25 | 7.0 | 3 | 0.0 | 0 | 15.550000 | 21.330000 |
| 2 | 0.98 | 514.5 | 294.0 | 110.25 | 7.0 | 4 | 0.0 | 0 | 15.550000 | 21.330000 |
| 3 | 0.98 | 514.5 | 294.0 | 110.25 | 7.0 | 5 | 0.0 | 0 | 15.550000 | 21.330000 |
| 4 | 0.90 | 563.5 | 318.5 | 122.50 | 7.0 | 2 | 0.0 | 0 | 20.840000 | 28.280001 |
| 5 | 0.90 | 563.5 | 318.5 | 122.50 | 7.0 | 3 | 0.0 | 0 | 21.459999 | 25.379999 |
| 6 | 0.90 | 563.5 | 318.5 | 122.50 | 7.0 | 4 | 0.0 | 0 | 20.709999 | 25.160000 |
| 7 | 0.90 | 563.5 | 318.5 | 122.50 | 7.0 | 5 | 0.0 | 0 | 19.680000 | 29.600000 |
| 8 | 0.86 | 588.0 | 294.0 | 147.00 | 7.0 | 2 | 0.0 | 0 | 19.500000 | 27.299999 |
| 9 | 0.86 | 588.0 | 294.0 | 147.00 | 7.0 | 3 | 0.0 | 0 | 19.950001 | 21.969999 |
Last rows
| | relative compactness | surface area | wall area | roof area | overall height | orientation | glazing area | glazing area distribution | heating load | cooling load |
|---|---|---|---|---|---|---|---|---|---|---|
| 758 | 0.66 | 759.5 | 318.5 | 220.5 | 3.5 | 4 | 0.4 | 5 | 14.920000 | 17.549999 |
| 759 | 0.66 | 759.5 | 318.5 | 220.5 | 3.5 | 5 | 0.4 | 5 | 15.160000 | 18.059999 |
| 760 | 0.64 | 784.0 | 343.0 | 220.5 | 3.5 | 2 | 0.4 | 5 | 17.690001 | 20.820000 |
| 761 | 0.64 | 784.0 | 343.0 | 220.5 | 3.5 | 3 | 0.4 | 5 | 18.190001 | 20.209999 |
| 762 | 0.64 | 784.0 | 343.0 | 220.5 | 3.5 | 4 | 0.4 | 5 | 18.160000 | 20.709999 |
| 763 | 0.64 | 784.0 | 343.0 | 220.5 | 3.5 | 5 | 0.4 | 5 | 17.879999 | 21.400000 |
| 764 | 0.62 | 808.5 | 367.5 | 220.5 | 3.5 | 2 | 0.4 | 5 | 16.540001 | 16.879999 |
| 765 | 0.62 | 808.5 | 367.5 | 220.5 | 3.5 | 3 | 0.4 | 5 | 16.440001 | 17.110001 |
| 766 | 0.62 | 808.5 | 367.5 | 220.5 | 3.5 | 4 | 0.4 | 5 | 16.480000 | 16.610001 |
| 767 | 0.62 | 808.5 | 367.5 | 220.5 | 3.5 | 5 | 0.4 | 5 | 16.639999 | 16.030001 |